Data-driven calibration of linear estimators with minimal penalties
Abstract
This paper tackles the problem of selecting among several linear estimators in nonparametric regression; this includes model selection for linear regression, the choice of a regularization parameter in kernel ridge regression or spline smoothing, and the choice of a kernel in multiple kernel learning. We propose a new algorithm that first consistently estimates the noise variance, building on the concept of minimal penalty, which was previously introduced in the context of model selection. Plugging this variance estimate into Mallows' C_L penalty is then proved to yield an algorithm satisfying an oracle inequality. Simulation experiments with kernel ridge regression and multiple kernel learning show that the proposed algorithm often improves significantly on existing calibration procedures such as 10-fold cross-validation and generalized cross-validation.
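To make the two-step procedure concrete, here is a minimal numerical sketch in the kernel ridge regression setting, where each candidate estimator is linear in the observations, f̂_λ = A_λ Y with A_λ = K(K + nλI)^{-1}. The grid of constants, the jump-detection rule, and all function names are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

def smoother_matrices(K, lambdas):
    """Kernel ridge regression smoothers A_lam = K (K + n*lam*I)^{-1}.

    For a symmetric kernel matrix K this equals (K + n*lam*I)^{-1} K,
    which is what np.linalg.solve computes below.
    """
    n = K.shape[0]
    return [np.linalg.solve(K + n * lam * np.eye(n), K) for lam in lambdas]

def estimate_variance(Y, A_list, C_grid):
    """Step 1: minimal-penalty estimate of the noise variance.

    For each constant C, minimize the empirical risk plus the minimal
    penalty shape C * (2*tr(A) - tr(A^T A)) / n.  The selected effective
    dimension tr(A) jumps from large to small as C crosses sigma^2; the
    jump location serves as the variance estimate.
    """
    n = len(Y)
    df = np.array([np.trace(A) for A in A_list])           # tr(A)
    df2 = np.array([np.sum(A * A) for A in A_list])        # tr(A^T A)
    risk = np.array([np.sum((Y - A @ Y) ** 2) / n for A in A_list])
    selected_df = np.array(
        [df[np.argmin(risk + C * (2 * df - df2) / n)] for C in C_grid]
    )
    jump = np.argmax(-np.diff(selected_df))                # largest drop
    return C_grid[jump + 1]                                # C just past the jump

def select_estimator(Y, A_list, sigma2):
    """Step 2: Mallows' C_L penalty with the estimated variance plugged in."""
    n = len(Y)
    crit = [np.sum((Y - A @ Y) ** 2) / n + 2.0 * sigma2 * np.trace(A) / n
            for A in A_list]
    return int(np.argmin(crit))

# Illustrative usage, given a kernel matrix K and responses Y:
# A_list = smoother_matrices(K, np.logspace(-6, 2, 50))
# sigma2_hat = estimate_variance(Y, A_list, C_grid=np.logspace(-2, 2, 200))
# best = select_estimator(Y, A_list, sigma2_hat)
```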
Similar papers
Calibration and empirical Bayes variable selection
For the problem of variable selection for the normal linear model, selection criteria such as AIC, C_p, BIC and RIC have fixed dimensionality penalties. Such criteria are shown to correspond to selection of maximum posterior models under implicit hyperparameter choices for a particular hierarchical Bayes formulation. Based on this calibration, we propose empirical Bayes selection criteria that...
Rademacher penalties and structural risk minimization
We suggest a penalty function to be used in various problems of structural risk minimization. This penalty is data dependent and is based on the sup-norm of the so-called Rademacher process indexed by the underlying class of functions (sets). The standard complexity penalties, used in learning problems and based on the VC-dimensions of the classes, are conservative upper bounds (in a probabilist...
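As a rough illustration of such a data-dependent penalty, the sketch below estimates the expected sup-norm of the Rademacher process over a finite class of candidate functions by Monte Carlo; the finite class, the averaging over draws, and the function name are simplifying assumptions rather than the paper's general construction.

```python
import numpy as np

def rademacher_penalty(predictions, n_draws=100, seed=0):
    """Estimate sup_f |(1/n) * sum_i eps_i f(x_i)| over a finite class,
    averaged over independent Rademacher signs eps_i in {-1, +1}.

    `predictions` is a (num_functions, n) array of f(x_i) values for each
    candidate function f evaluated on the sample.
    """
    rng = np.random.default_rng(seed)
    _, n = predictions.shape
    sups = []
    for _ in range(n_draws):
        eps = rng.choice([-1.0, 1.0], size=n)
        sups.append(np.max(np.abs(predictions @ eps)) / n)
    return float(np.mean(sups))
```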
PENSE: A Penalized Elastic Net S-Estimator
Penalized regression estimators have been widely used in recent years to improve the prediction properties of linear models, particularly when the number of explanatory variables is large. It is well-known that different penalties result in regularized estimators with varying statistical properties. Motivated by the analysis of plasma proteomic biomarkers that tend to form groups of correlated ...
Data-driven Calibration of Penalties for Least-Squares Regression
Penalization procedures often suffer from their dependence on multiplying factors, whose optimal values are either unknown or hard to estimate from the data. We propose a completely data-driven calibration algorithm for this parameter in the least squares regression framework, without assuming a particular shape for the penalty. Our algorithm relies on the concept of minimal penalty, recently i...
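The identity driving this calibration, stated here for the standard special case of least-squares projection estimators onto models of dimension D_m with noise variance sigma^2 (a textbook instance, not the paper's full generality), is that the optimal penalty is twice the minimal one:

```latex
\mathrm{pen}_{\min}(m) = \frac{\sigma^2 D_m}{n},
\qquad
\mathrm{pen}_{\mathrm{opt}}(m) = \frac{2 \sigma^2 D_m}{n} = 2\,\mathrm{pen}_{\min}(m).
```

Hence the constant at which the selected dimension jumps estimates sigma^2, and doubling it calibrates the penalty.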
Calibration Weighting to Compensate for Extreme Values, Non-response and Non-coverage in Labor Force Survey
Frame imperfection, non-response and unequal selection probabilities always affect survey results. In order to compensate for the effects of these problems, Deville and Särndal (1992) introduced a family of estimators called calibration estimators. In these estimators we look for weights that have minimal distance from the design weights with respect to a distance function and satisfy the calibration equa...
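For the chi-square distance, this constrained minimization has a closed-form solution (the linear calibration weights of Deville and Särndal); the short sketch below illustrates that specific case, with illustrative names throughout.

```python
import numpy as np

def calibrate_weights(d, X, totals):
    """Linear calibration: minimize sum_i (w_i - d_i)^2 / d_i subject to
    the calibration equations X^T w = totals.

    d      : (n,) design weights
    X      : (n, p) auxiliary variables observed for each sampled unit
    totals : (p,) known population totals of the auxiliary variables
    """
    lam = np.linalg.solve(X.T @ (d[:, None] * X), totals - X.T @ d)
    return d * (1.0 + X @ lam)
```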